A filter feature selection for high-dimensional data
Authors
Abstract
In a classification problem, before building a prediction model, it is very important to identify the informative features rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. To overcome these problems, the best solution is to use feature selection. In this article, we propose a new filter method for feature selection. By combining the Relief filter algorithm with the multi-criteria decision-making method called TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), we model the feature selection task as a multi-criteria decision problem. Exploiting the Relief methodology, a decision matrix is computed and delivered to TOPSIS in order to rank the features. The proposed method ends up giving a ranking of the features from the best to the mediocre. To evaluate the performance of the suggested approach, a simulation study including a set of experiments and case studies was conducted on three synthetic dataset scenarios. Finally, the obtained results confirm the effectiveness of our method to detect ...
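The TOPSIS ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the Relief-based decision matrix is built, so the matrix below is a hypothetical one in which each row is a feature and each column is an assumed relevance criterion. The function name `topsis_rank`, the equal default weights, and the choice of vector normalisation are likewise assumptions made for illustration.

```python
import math


def topsis_rank(matrix, weights=None, benefit=None):
    """Rank alternatives (rows) by TOPSIS closeness to the ideal solution.

    matrix  : list of rows, one per alternative (here, one per feature)
    weights : one weight per criterion (column); equal weights if omitted
    benefit : one bool per criterion, True if larger is better (default: all)
    Returns (order, closeness): row indices sorted best-first, and the scores.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    if weights is None:
        weights = [1.0 / n_cols] * n_cols
    if benefit is None:
        benefit = [True] * n_cols

    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]

    # Step 2: ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, benefit)]

    # Step 3: relative closeness = d(anti) / (d(ideal) + d(anti)).
    closeness = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        closeness.append(d_neg / total if total else 0.0)

    # Step 4: sort alternatives from best (closest to ideal) to worst.
    order = sorted(range(n_rows), key=lambda i: closeness[i], reverse=True)
    return order, closeness


# Hypothetical decision matrix: 3 features scored on 3 assumed criteria.
scores = [
    [0.90, 0.80, 0.85],  # feature 0: high on every criterion
    [0.10, 0.20, 0.15],  # feature 1: low on every criterion
    [0.50, 0.40, 0.60],  # feature 2: in between
]
order, closeness = topsis_rank(scores)
print(order)  # [0, 2, 1]
```

In this toy case feature 0 coincides with the ideal point and feature 1 with the anti-ideal point, so their closeness scores are 1 and 0 respectively, and feature 2 falls between them.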
Related articles
Feature Selection for High Dimensional Data: An Evolutionary Filter Approach
Problem statement: Feature selection is a task of crucial importance for the application of machine learning in various domains. In addition, the recent increase of data dimensionality poses a severe challenge to many existing feature selection approaches with respect to efficiency and effectiveness. As an example, genetic algorithm is an effective search algorithm that lends itself directly to...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Feature Selection for High-dimensional Integrated Data
Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative perfor...
Feature selection for high-dimensional industrial data
In the semiconductor industry the number of circuits per chip is still drastically increasing. This fact and strong competition lead to the particular importance of quality control and quality assurance. As a result a vast amount of data is recorded during the fabrication process, which is very complex in structure and massively affected by noise. The evaluation of this data is a vital task to ...
Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution
Feature selection, as a preprocessing step to machine learning, has been effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a...
Journal
Journal title: Journal of Algorithms & Computational Technology
Year: 2023
ISSN: 1748-3018, 1748-3026
DOI: https://doi.org/10.1177/17483026231184171